CRE/MF Mortgage Rate Model Demo

Setup

# Load API key and secret from environment variables
from dotenv import load_dotenv
load_dotenv()

# ML libraries
import pandas as pd
import xgboost as xgb
from numpy import argmax
from sklearn.metrics import accuracy_score, precision_recall_curve
from sklearn.model_selection import train_test_split

# ValidMind libraries 
import validmind as vm
from validmind.vm_models.test_context import TestContext

# Plotting libraries 
import matplotlib.pyplot as plt
%matplotlib inline

Load Data

df = pd.read_csv("../datasets/lending_club_loan_rates.csv", sep='\t')
df = df.rename(columns={'Unnamed: 0': 'Date'})
df = df.set_index(pd.to_datetime(df['Date']))
df.drop(["Date"], axis=1, inplace=True)

# Remove diff columns
columns_to_remove = [col for col in df.columns if col.startswith("diff")]
df = df.drop(columns=columns_to_remove)
df.head()
            loan_rate_A  loan_rate_B  loan_rate_C  loan_rate_D  FEDFUNDS
Date
2007-08-01     7.766667     9.497692    10.947500    12.267000      5.02
2007-09-01     7.841429     9.276667    10.829167    12.436667      4.94
2007-10-01     7.830000     9.433333    10.825926    12.737368      4.76
2007-11-01     7.779091     9.467778    10.967037    12.609444      4.49
2007-12-01     7.695833     9.387500    10.805000    12.478889      4.24

Visual Inspection.

ValidMind Setup

Initialize the ValidMind client, then the ValidMind dataset.

vm.init(
  api_host = "http://localhost:3000/api/v1/tracking",
  api_key = "e22b89a6b9c2a27da47cb0a09febc001",
  api_secret = "a61be901b5596e3c528d94231e4a3c504ef0bb803d16815f8dfd6857fac03e57",
  project = "clgo0g0rt0000fjy6ozl9pb69"
)
True
target_variables = ["loan_rate_A", "loan_rate_B", "loan_rate_C", "loan_rate_D"]

vm_dataset = vm.init_dataset(
    dataset=df,
    target_column = target_variables   
)
Pandas dataset detected. Initializing VM Dataset instance...
Inferring dataset types...

Visualize existing test plans.

vm.test_plans.list_plans()

4. Model Development

4.1. Development Data and Platform

4.1.2. Data Quality and Relevance

4.1.3. Data Process, Adjustments and Treatment

A. Missing Values Analysis

Step 1: Calculate the percentage of missing values in each column

Step 2: Display the missing values percentage in a table format

Step 3: Visualize the missing values
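
The three steps above can be sketched with pandas alone. This is a minimal illustration, not the notebook's own implementation: the helper name `missing_values_table` and the toy frame are hypothetical stand-ins for the loan-rate data.

```python
import numpy as np
import pandas as pd


def missing_values_table(frame: pd.DataFrame) -> pd.DataFrame:
    """Return the count and percentage of missing values per column."""
    missing_count = frame.isna().sum()
    missing_pct = 100 * missing_count / len(frame)
    table = pd.DataFrame(
        {"missing_count": missing_count, "missing_pct": missing_pct}
    )
    # Columns with the most missing data come first
    return table.sort_values("missing_pct", ascending=False)


# Hypothetical toy data standing in for the loan-rate frame
toy = pd.DataFrame(
    {
        "loan_rate_A": [7.77, np.nan, 7.83, 7.78],
        "FEDFUNDS": [5.02, 4.94, 4.76, 4.49],
    }
)
missing_table = missing_values_table(toy)
print(missing_table)
```

For Step 3, `missing_table["missing_pct"].plot.bar()` would give a quick bar chart of the same percentages.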

B. Outliers Analysis

Step 1: Visualize the dataset using box plots

Visualize the data using box plots to get an initial sense of the presence of outliers.

Step 2: Calculate Z-scores

Step 3: Set a threshold and identify outliers

Set a threshold (e.g., 3) to identify data points with Z-scores higher than the threshold.

Step 4: Analyze the outliers

Analyze the outliers by looking at their frequency, index, and corresponding column.
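
Steps 2 to 4 can be sketched as follows. This is an illustrative example under stated assumptions: the helper `find_outliers` and the toy frame are hypothetical, and the population standard deviation (`ddof=0`) is one reasonable choice rather than something the notebook specifies.

```python
import numpy as np
import pandas as pd


def find_outliers(frame: pd.DataFrame, threshold: float = 3.0) -> pd.DataFrame:
    """List every observation whose absolute Z-score exceeds the threshold."""
    # Column-wise Z-scores (ddof=0 is an assumption, not from the source)
    z_scores = (frame - frame.mean()) / frame.std(ddof=0)
    # Keep only the flagged cells, then reshape to one row per outlier
    flagged = z_scores.where(z_scores.abs() > threshold).stack().dropna()
    flagged = flagged.rename("z_score").reset_index()
    flagged.columns = ["index", "column", "z_score"]
    return flagged


# Hypothetical toy data: one obvious spike in loan_rate_A
toy = pd.DataFrame(
    {
        "loan_rate_A": [8.0] * 20 + [20.0],
        "FEDFUNDS": np.arange(21, dtype=float),
    }
)
outliers = find_outliers(toy, threshold=3.0)
print(outliers)
```

`outliers["column"].value_counts()` then gives the outlier frequency per column, which covers the analysis described in Step 4.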

C. Stationarity Analysis

Step 1: Run Unit Root Tests

from validmind.test_plans.statsmodels_timeseries import UnitRoot 
test_context = TestContext(dataset=vm_dataset)
ur_test_plan = UnitRoot(test_context=test_context)
ur_test_plan.run()
The test statistic is outside of the range of p-values available in the
look-up table. The actual p-value is smaller than the p-value returned.

The test statistic is outside of the range of p-values available in the
look-up table. The actual p-value is greater than the p-value returned.

The test statistic is outside of the range of p-values available in the
look-up table. The actual p-value is smaller than the p-value returned.

The test statistic is outside of the range of p-values available in the
look-up table. The actual p-value is smaller than the p-value returned.

                                                                                                         

Results for unit_root Test Plan:


Logged the following dataset metric to the ValidMind platform:

Metric Name: adf
Metric Type: dataset
Metric Scope:
Metric Value:
{'loan_rate_A': {'stat': -1.917289312690944, 'pvalue': 0.32397189281015515, 'usedlag': 1, 'nobs': 135, 'critical_values': {'1%': -3.479742586699182, '5%': -2.88319822181578, '10%': -2.578319684499314}, 'icbest': -71.08908853191068}, 'loan_rate_B': {'stat': -3.1599303710498425, 'pvalue': 0.022424413263559147, 'usedlag': 9, 'nobs': 127, 'critical_values': {'1%': -3.482920063655088, '5%': -2.884580323367261, '10%': -2.5790575441750883}, 'icbest': -42.45027033820841}, 'loan_rate_C': {'stat': -2.530666699941385, 'pvalue': 0.10818994357289696, 'usedlag': 1, 'nobs': 135, 'critical_values': {'1%': -3.479742586699182, '5%': -2.88319822181578, '10%': -2.578319684499314}, 'icbest': -92.19465856666866}, 'loan_rate_D': {'stat': -1.617158531178829, 'pvalue': 0.47421928207593467, 'usedlag': 6, 'nobs': 130, 'critical_values': {'1%': -3.4816817173418295, '5%': -2.8840418343195267, '10%': -2.578770059171598}, 'icbest': -4.9426661983780775}, 'FEDFUNDS': {'stat': -0.16854321128256927, 'pvalue': 0.9421687822974046, 'usedlag': 13,...

Logged the following evaluation metric to the ValidMind platform:

Metric Name: kpss
Metric Type: evaluation
Metric Scope: test
Metric Value:
{'loan_rate_A': {'stat': 1.012356679488042, 'pvalue': 0.01, 'usedlag': 6, 'critical_values': {'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739}}, 'loan_rate_B': {'stat': 0.26307336980308743, 'pvalue': 0.1, 'usedlag': 6, 'critical_values': {'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739}}, 'loan_rate_C': {'stat': 0.8099610581510324, 'pvalue': 0.01, 'usedlag': 6, 'critical_values': {'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739}}, 'loan_rate_D': {'stat': 1.5641456258111677, 'pvalue': 0.01, 'usedlag': 6, 'critical_values': {'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739}}, 'FEDFUNDS': {'stat': 0.37574599218114024, 'pvalue': 0.08760948612881886, 'usedlag': 6, 'critical_values': {'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739}}}

Logged the following evaluation metric to the ValidMind platform:

Metric Name: phillips_perron
Metric Type: evaluation
Metric Scope: test
Metric Value:
{'loan_rate_A': {'stat': -2.027461766308415, 'pvalue': 0.2746633686983735, 'usedlag': 13, 'nobs': 136}, 'loan_rate_B': {'stat': -2.4460840425020693, 'pvalue': 0.1291495662057986, 'usedlag': 13, 'nobs': 136}, 'loan_rate_C': {'stat': -2.264992296812233, 'pvalue': 0.18350727828330493, 'usedlag': 13, 'nobs': 136}, 'loan_rate_D': {'stat': -1.8536131757387562, 'pvalue': 0.35418744654020773, 'usedlag': 13, 'nobs': 136}, 'FEDFUNDS': {'stat': -3.970771700595016, 'pvalue': 0.00157153429138061, 'usedlag': 13, 'nobs': 136}}

Logged the following evaluation metric to the ValidMind platform:

Metric Name: zivot_andrews
Metric Type: evaluation
Metric Scope: test
Metric Value:
{'loan_rate_A': {'stat': -3.4986701764941164, 'pvalue': 0.6804771233224802, 'usedlag': 1, 'nobs': 137}, 'loan_rate_B': {'stat': -5.24634607418534, 'pvalue': 0.011857344006006859, 'usedlag': 9, 'nobs': 137}, 'loan_rate_C': {'stat': -4.681009266047663, 'pvalue': 0.07413460165891156, 'usedlag': 10, 'nobs': 137}, 'loan_rate_D': {'stat': -4.5661357909991125, 'pvalue': 0.10001302102995041, 'usedlag': 5, 'nobs': 137}, 'FEDFUNDS': {'stat': -4.263682850153341, 'pvalue': 0.20816298878228218, 'usedlag': 13, 'nobs': 137}}

Logged the following evaluation metric to the ValidMind platform:

Metric Name: dickey_fuller_gls
Metric Type: evaluation
Metric Scope: test
Metric Value:
{'loan_rate_A': {'stat': -1.79800835619108, 'pvalue': 0.07127713927441981, 'usedlag': 1, 'nobs': 135}, 'loan_rate_B': {'stat': -1.3606975485926263, 'pvalue': 0.16677847263199563, 'usedlag': 9, 'nobs': 127}, 'loan_rate_C': {'stat': -0.4700035237018868, 'pvalue': 0.5125489900593104, 'usedlag': 10, 'nobs': 126}, 'loan_rate_D': {'stat': 0.2675751876186429, 'pvalue': 0.777908700165412, 'usedlag': 5, 'nobs': 131}, 'FEDFUNDS': {'stat': -1.4167584741514279, 'pvalue': 0.15109358273536777, 'usedlag': 13, 'nobs': 123}}

Unit Root Tests with Stationarity Decision.

# Question: Ideally we would like to show results of unit root test plan like this (results hardcoded)
unit_root_test_results = {
    'Series': ['loan_rate_A', 'loan_rate_A', 'loan_rate_A', 'loan_rate_A', 'loan_rate_A', 'loan_rate_B', 'loan_rate_B', 'loan_rate_B', 'loan_rate_B', 'loan_rate_B', 'loan_rate_C', 'loan_rate_C', 'loan_rate_C', 'loan_rate_C', 'loan_rate_C', 'loan_rate_D', 'loan_rate_D', 'loan_rate_D', 'loan_rate_D', 'loan_rate_D', 'FEDFUNDS', 'FEDFUNDS', 'FEDFUNDS', 'FEDFUNDS', 'FEDFUNDS'],
    'Test': ['ADF', 'KPSS', 'Phillips-Perron', 'Zivot-Andrews', 'DFGLS', 'ADF', 'KPSS', 'Phillips-Perron', 'Zivot-Andrews', 'DFGLS', 'ADF', 'KPSS', 'Phillips-Perron', 'Zivot-Andrews', 'DFGLS', 'ADF', 'KPSS', 'Phillips-Perron', 'Zivot-Andrews', 'DFGLS', 'ADF', 'KPSS', 'Phillips-Perron', 'Zivot-Andrews', 'DFGLS'],
    'p-value': [0.323972, 0.010000, 0.274663, 0.680477, 0.071277, 0.022424, 0.100000, 0.129150, 0.011857, 0.166778, 0.108190, 0.010000, 0.183507, 0.074135, 0.512549, 0.474219, 0.010000, 0.354187, 0.100013, 0.777909, 0.942169, 0.087609, 0.001572, 0.208163, 0.151094],
    'Threshold': [0.05] * 25,
    'Pass/Fail': ['Pass', 'Pass', 'Pass', 'Pass', 'Pass', 'Fail', 'Fail', 'Pass', 'Fail', 'Pass', 'Pass', 'Pass', 'Pass', 'Pass', 'Pass', 'Pass', 'Pass', 'Pass', 'Pass', 'Pass', 'Pass', 'Fail', 'Fail', 'Pass', 'Pass'],
    # KPSS tests the null of stationarity, so p < 0.05 means non-stationary
    # (the reverse of the other four tests, whose null is a unit root).
    'Decision': [
        'non-stationary', 'non-stationary', 'non-stationary', 'non-stationary', 'non-stationary',  # loan_rate_A
        'stationary', 'stationary', 'non-stationary', 'stationary', 'non-stationary',              # loan_rate_B
        'non-stationary', 'non-stationary', 'non-stationary', 'non-stationary', 'non-stationary',  # loan_rate_C
        'non-stationary', 'non-stationary', 'non-stationary', 'non-stationary', 'non-stationary',  # loan_rate_D
        'non-stationary', 'stationary', 'stationary', 'non-stationary', 'non-stationary',          # FEDFUNDS
    ]
}

display(pd.DataFrame(unit_root_test_results))
         Series             Test   p-value  Threshold  Pass/Fail        Decision
0   loan_rate_A              ADF  0.323972       0.05       Pass  non-stationary
1   loan_rate_A             KPSS  0.010000       0.05       Pass  non-stationary
2   loan_rate_A  Phillips-Perron  0.274663       0.05       Pass  non-stationary
3   loan_rate_A    Zivot-Andrews  0.680477       0.05       Pass  non-stationary
4   loan_rate_A            DFGLS  0.071277       0.05       Pass  non-stationary
5   loan_rate_B              ADF  0.022424       0.05       Fail      stationary
6   loan_rate_B             KPSS  0.100000       0.05       Fail      stationary
7   loan_rate_B  Phillips-Perron  0.129150       0.05       Pass  non-stationary
8   loan_rate_B    Zivot-Andrews  0.011857       0.05       Fail      stationary
9   loan_rate_B            DFGLS  0.166778       0.05       Pass  non-stationary
10  loan_rate_C              ADF  0.108190       0.05       Pass  non-stationary
11  loan_rate_C             KPSS  0.010000       0.05       Pass  non-stationary
12  loan_rate_C  Phillips-Perron  0.183507       0.05       Pass  non-stationary
13  loan_rate_C    Zivot-Andrews  0.074135       0.05       Pass  non-stationary
14  loan_rate_C            DFGLS  0.512549       0.05       Pass  non-stationary
15  loan_rate_D              ADF  0.474219       0.05       Pass  non-stationary
16  loan_rate_D             KPSS  0.010000       0.05       Pass  non-stationary
17  loan_rate_D  Phillips-Perron  0.354187       0.05       Pass  non-stationary
18  loan_rate_D    Zivot-Andrews  0.100013       0.05       Pass  non-stationary
19  loan_rate_D            DFGLS  0.777909       0.05       Pass  non-stationary
20     FEDFUNDS              ADF  0.942169       0.05       Pass  non-stationary
21     FEDFUNDS             KPSS  0.087609       0.05       Fail      stationary
22     FEDFUNDS  Phillips-Perron  0.001572       0.05       Fail      stationary
23     FEDFUNDS    Zivot-Andrews  0.208163       0.05       Pass  non-stationary
24     FEDFUNDS            DFGLS  0.151094       0.05       Pass  non-stationary

Interpretation of Unit Root Tests.
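
Rather than hardcoding the decision column, it could be derived from each test's p-value. The sketch below is one possible approach, not a ValidMind feature: `stationarity_decision` and `STATIONARITY_NULL_TESTS` are hypothetical names. The key point is that ADF, Phillips-Perron, Zivot-Andrews, and DFGLS take a unit root as the null hypothesis, while KPSS takes stationarity as the null, so the same p-value leads to opposite conclusions.

```python
import pandas as pd

# KPSS is the only test here whose null hypothesis is stationarity
STATIONARITY_NULL_TESTS = {"KPSS"}


def stationarity_decision(test: str, p_value: float, threshold: float = 0.05) -> str:
    """Map a unit root test's p-value to a stationarity decision."""
    reject_null = p_value < threshold
    if test in STATIONARITY_NULL_TESTS:
        # Rejecting "the series is stationary" means non-stationary
        return "non-stationary" if reject_null else "stationary"
    # Rejecting "the series has a unit root" means stationary
    return "stationary" if reject_null else "non-stationary"


# p-values for loan_rate_B, copied from the table above
p_values = {
    "ADF": 0.022424,
    "KPSS": 0.100000,
    "Phillips-Perron": 0.129150,
    "Zivot-Andrews": 0.011857,
    "DFGLS": 0.166778,
}
decisions = pd.DataFrame(
    [
        {
            "Series": "loan_rate_B",
            "Test": test,
            "p-value": p_value,
            "Decision": stationarity_decision(test, p_value),
        }
        for test, p_value in p_values.items()
    ]
)
print(decisions)
```

Note that the KPSS p-values reported by the test are bounded by its look-up table (0.01 to 0.1), so a decision based on them is only as precise as that range allows.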

Step 2: Making Series Stationary

Compute first difference.

diff_df = df.diff().dropna()

Inspect time series.

Step 3: Run Unit Root Tests

# Pass first difference to VM dataset
# Question: I am now overwriting the df, can I log both raw and first diff dataset and use them as required later on? 
vm_dataset = vm.init_dataset(dataset=diff_df)
test_context = TestContext(dataset=vm_dataset)
ur_test_plan = UnitRoot(test_context=test_context)
ur_test_plan.run()
Pandas dataset detected. Initializing VM Dataset instance...
Inferring dataset types...
The test statistic is outside of the range of p-values available in the
look-up table. The actual p-value is greater than the p-value returned.

The test statistic is outside of the range of p-values available in the
look-up table. The actual p-value is greater than the p-value returned.

The test statistic is outside of the range of p-values available in the
look-up table. The actual p-value is greater than the p-value returned.

The test statistic is outside of the range of p-values available in the
look-up table. The actual p-value is greater than the p-value returned.

The test statistic is outside of the range of p-values available in the
look-up table. The actual p-value is smaller than the p-value returned.

                                                                                                         

Results for unit_root Test Plan:


Logged the following dataset metric to the ValidMind platform:

Metric Name: adf
Metric Type: dataset
Metric Scope:
Metric Value:
{'loan_rate_A': {'stat': -10.288173999889596, 'pvalue': 3.628391608895891e-18, 'usedlag': 0, 'nobs': 135, 'critical_values': {'1%': -3.479742586699182, '5%': -2.88319822181578, '10%': -2.578319684499314}, 'icbest': -77.77997210900043}, 'loan_rate_B': {'stat': -8.581774306100574, 'pvalue': 7.693931170347862e-14, 'usedlag': 0, 'nobs': 135, 'critical_values': {'1%': -3.479742586699182, '5%': -2.88319822181578, '10%': -2.578319684499314}, 'icbest': -39.517450997610695}, 'loan_rate_C': {'stat': -8.497490203712452, 'pvalue': 1.264293522721377e-13, 'usedlag': 0, 'nobs': 135, 'critical_values': {'1%': -3.479742586699182, '5%': -2.88319822181578, '10%': -2.578319684499314}, 'icbest': -89.85288082240504}, 'loan_rate_D': {'stat': -4.089047184282974, 'pvalue': 0.0010096960791906875, 'usedlag': 5, 'nobs': 130, 'critical_values': {'1%': -3.4816817173418295, '5%': -2.8840418343195267, '10%': -2.578770059171598}, 'icbest': -2.8530273746351327}, 'FEDFUNDS': {'stat': -7.7635076306973865, 'pvalue': 9.31881384691925e-12, 'usedla...

Logged the following evaluation metric to the ValidMind platform:

Metric Name: kpss
Metric Type: evaluation
Metric Scope: test
Metric Value:
{'loan_rate_A': {'stat': 0.08690727348877204, 'pvalue': 0.1, 'usedlag': 2, 'critical_values': {'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739}}, 'loan_rate_B': {'stat': 0.1406230557198794, 'pvalue': 0.1, 'usedlag': 3, 'critical_values': {'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739}}, 'loan_rate_C': {'stat': 0.302111183227079, 'pvalue': 0.1, 'usedlag': 3, 'critical_values': {'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739}}, 'loan_rate_D': {'stat': 0.2150982513758707, 'pvalue': 0.1, 'usedlag': 1, 'critical_values': {'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739}}, 'FEDFUNDS': {'stat': 1.1100238305291938, 'pvalue': 0.01, 'usedlag': 5, 'critical_values': {'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739}}}

Logged the following evaluation metric to the ValidMind platform:

Metric Name: phillips_perron
Metric Type: evaluation
Metric Scope: test
Metric Value:
{'loan_rate_A': {'stat': -10.31157577193113, 'pvalue': 3.175252114879247e-18, 'usedlag': 13, 'nobs': 135}, 'loan_rate_B': {'stat': -8.656685376413767, 'pvalue': 4.9471754291376904e-14, 'usedlag': 13, 'nobs': 135}, 'loan_rate_C': {'stat': -8.968716395365107, 'pvalue': 7.859548454989337e-15, 'usedlag': 13, 'nobs': 135}, 'loan_rate_D': {'stat': -9.359989759533976, 'pvalue': 7.870234405243031e-16, 'usedlag': 13, 'nobs': 135}, 'FEDFUNDS': {'stat': -5.577757930833498, 'pvalue': 1.4193167380557158e-06, 'usedlag': 13, 'nobs': 135}}

Logged the following evaluation metric to the ValidMind platform:

Metric Name: zivot_andrews
Metric Type: evaluation
Metric Scope: test
Metric Value:
{'loan_rate_A': {'stat': -10.647554732355527, 'pvalue': 1e-05, 'usedlag': 0, 'nobs': 136}, 'loan_rate_B': {'stat': -8.878000651641708, 'pvalue': 1e-05, 'usedlag': 0, 'nobs': 136}, 'loan_rate_C': {'stat': -9.176489623666406, 'pvalue': 1e-05, 'usedlag': 0, 'nobs': 136}, 'loan_rate_D': {'stat': -4.7362602346101195, 'pvalue': 0.06385037699908429, 'usedlag': 5, 'nobs': 136}, 'FEDFUNDS': {'stat': -8.74079625000198, 'pvalue': 1e-05, 'usedlag': 13, 'nobs': 136}}

Logged the following evaluation metric to the ValidMind platform:

Metric Name: dickey_fuller_gls
Metric Type: evaluation
Metric Scope: test
Metric Value:
{'loan_rate_A': {'stat': -9.407796046868018, 'pvalue': 8.038704892435365e-16, 'usedlag': 0, 'nobs': 135}, 'loan_rate_B': {'stat': -6.1286593160527945, 'pvalue': 6.978825245356776e-09, 'usedlag': 0, 'nobs': 135}, 'loan_rate_C': {'stat': -6.7890425662958975, 'pvalue': 2.863415572617712e-10, 'usedlag': 0, 'nobs': 135}, 'loan_rate_D': {'stat': -3.2170698858406395, 'pvalue': 0.0013827003089759758, 'usedlag': 5, 'nobs': 130}, 'FEDFUNDS': {'stat': -3.775689552643456, 'pvalue': 0.0001896667660634073, 'usedlag': 13, 'nobs': 122}}

Unit Root Tests with Stationarity Decision.

Interpretation of Unit Root Tests.

Step 4: Decision

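The loan rate series are stationary after first differencing: ADF, Phillips-Perron, and DFGLS reject a unit root for every series, and KPSS does not reject stationarity for any of the four loan rates. Two results go the other way: KPSS rejects stationarity for differenced FEDFUNDS (p-value 0.01), and Zivot-Andrews does not reject a unit root for differenced loan_rate_D at the 5% level (p-value 0.064).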

D. Seasonality Analysis

Step 1: Seasonal decomposition

Perform seasonal decomposition on each time series.

from validmind.model_validation.statsmodels.metrics import SeasonalDecompose
test_context = TestContext(train_ds=vm_train_ds)
sd_metric = SeasonalDecompose(test_context=test_context)
NameError: name 'vm_train_ds' is not defined
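
The NameError above occurs because `vm_train_ds` is never defined at this point in the notebook. A minimal sketch of a chronological train/test split follows. The helper `chronological_split`, the 80/20 ratio, and the toy frame are all assumptions; the source does not state how the training and test sets were built.

```python
import numpy as np
import pandas as pd


def chronological_split(frame: pd.DataFrame, train_fraction: float = 0.8):
    """Split a time-indexed frame into earlier (train) and later (test) parts."""
    frame = frame.sort_index()
    cutoff = int(len(frame) * train_fraction)
    # No shuffling: the test set must lie strictly after the training set
    return frame.iloc[:cutoff], frame.iloc[cutoff:]


# Hypothetical monthly data standing in for the loan-rate frame
dates = pd.date_range("2007-08-01", periods=10, freq="MS")
toy = pd.DataFrame({"loan_rate_A": np.linspace(7.5, 8.4, 10)}, index=dates)

train_df, test_df = chronological_split(toy, train_fraction=0.8)
print(len(train_df), len(test_df))
```

The two frames could then be wrapped with `vm.init_dataset(dataset=train_df)` and `vm.init_dataset(dataset=test_df)`, mirroring the call used earlier, to produce `vm_train_ds` and `vm_test_ds`.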

Step 2: Visualize seasonal decomposition

Create plots for observed, trend, seasonal and residual components.

sd_metric.run()
sd_metric.result.show()
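
To make the four components concrete, here is a sketch of a classical additive decomposition written with NumPy and pandas only. It assumes a 12-month period and an additive model, and it may differ from what `SeasonalDecompose` does internally; `additive_decompose` and the toy series are hypothetical.

```python
import numpy as np
import pandas as pd


def additive_decompose(series: pd.Series, period: int = 12) -> pd.DataFrame:
    """Classical additive decomposition: observed = trend + seasonal + resid."""
    values = series.to_numpy(dtype=float)
    n = len(values)
    # Centered moving average; an even period needs half weights at both ends
    if period % 2 == 0:
        weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        weights = np.ones(period) / period
    half = len(weights) // 2
    trend = np.full(n, np.nan)  # undefined at both edges of the sample
    trend[half:n - half] = np.convolve(values, weights, mode="valid")
    # Average the detrended values by position within the period
    detrended = values - trend
    positions = np.arange(n) % period
    period_means = np.array(
        [np.nanmean(detrended[positions == k]) for k in range(period)]
    )
    period_means -= period_means.mean()  # seasonal effects sum to zero
    seasonal = period_means[positions]
    return pd.DataFrame(
        {
            "observed": values,
            "trend": trend,
            "seasonal": seasonal,
            "resid": values - trend - seasonal,
        },
        index=series.index,
    )


# Hypothetical series: linear trend plus a 12-month sine pattern
months = pd.date_range("2007-08-01", periods=48, freq="MS")
t = np.arange(48)
true_seasonal = np.sin(2 * np.pi * t / 12)
toy_series = pd.Series(8.0 + 0.05 * t + true_seasonal, index=months)

components = additive_decompose(toy_series, period=12)
print(components.head(8))
```

With a period of 12, the trend (and therefore the residual) is undefined for the first and last six observations, so those rows contain NaN.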

Seasonality Detection using ACF and PACF.

from validmind.model_validation.statsmodels.metrics import SeasonalityDetectionWithACFandPACF
test_context = TestContext(train_ds=vm_train_ds)
acf_metric = SeasonalityDetectionWithACFandPACF(test_context=test_context)
acf_metric.run()
acf_metric.result.show()
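
As a rough illustration of how the ACF reveals seasonality, the sketch below computes sample autocorrelations with NumPy. `sample_acf` is a hypothetical helper, not part of ValidMind, and the toy series is synthetic. A seasonal series shows positive peaks at multiples of the seasonal period and troughs at the half period.

```python
import numpy as np


def sample_acf(values, nlags: int = 40) -> np.ndarray:
    """Sample autocorrelation for lags 0..nlags."""
    x = np.asarray(values, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    lags = [np.dot(x[:-k], x[k:]) / denom for k in range(1, nlags + 1)]
    return np.array([1.0] + lags)


# Hypothetical series with a pure 12-month cycle over 10 years
monthly = np.sin(2 * np.pi * np.arange(120) / 12)
acf = sample_acf(monthly, nlags=40)

print(round(acf[6], 3), round(acf[12], 3))
```

Real rate series are dominated by trend, so their ACF decays slowly and can mask seasonal peaks; applying the same function to differenced data is usually more informative.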

Step 3: Residuals Analysis

Residuals series, histogram, Q-Q and ACF plots.

# Comment: How do I pass the residuals of the seasonal decomposition done before using SeasonalDecomposeMetricWithFigure?
from validmind.model_validation.statsmodels.metrics import ResidualsVisualInspection
test_context = TestContext(train_ds=vm_train_ds)
rvi_metric = ResidualsVisualInspection(test_context=test_context)
rvi_metric.run()
rvi_metric.result.show()

Test if Residuals are Normally Distributed.

# Comment: How do I pass the residuals of the seasonal decomposition done before using SeasonalDecomposeMetricWithFigure?
vm.run_test_plan("normality_test_plan", train_ds=vm_train_ds, test_ds=vm_test_ds)
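
For readers who want to see what one of the normality tests computes, here is a sketch of the Jarque-Bera statistic. The helper names are hypothetical. The skewness and kurtosis in the example are the loan_rate_A values from the jarque_bera metric logged later in this notebook; the sample size of 107 is an assumption, chosen because it reproduces the logged statistic, and is not stated in the source.

```python
import math

import numpy as np


def jb_from_moments(n: int, skew: float, kurtosis: float):
    """Jarque-Bera statistic and p-value from sample skewness and kurtosis."""
    stat = n / 6.0 * (skew ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Chi-square survival function with 2 degrees of freedom
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


def jarque_bera(values):
    """Jarque-Bera test computed directly from data."""
    x = np.asarray(values, dtype=float)
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)
    skew = np.mean(centered ** 3) / m2 ** 1.5
    kurtosis = np.mean(centered ** 4) / m2 ** 2  # equals 3 for a normal distribution
    return jb_from_moments(len(x), skew, kurtosis)


# Skewness and kurtosis from the logged output; n = 107 is an assumption
jb_stat, jb_pvalue = jb_from_moments(107, 0.3894929136110148, 2.821048639289756)
print(jb_stat, jb_pvalue)
```

A p-value above 0.05, as here, means normality is not rejected at the 5% level.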

Test if Residuals are Autocorrelated.

# Comment: How do I pass the residuals of the seasonal decomposition done before using SeasonalDecomposeMetricWithFigure?
vm.run_test_plan("autocorrelation_test_plan", train_ds=vm_train_ds, test_ds=vm_test_ds)
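
The Ljung-Box test in this plan can be sketched as follows for a single lag. Using one lag is an assumption: it is what the ljung_box p-values logged later in this notebook are consistent with, but the source does not state the lag count. The function names are hypothetical.

```python
import math

import numpy as np


def chi2_sf_1df(stat: float) -> float:
    """Chi-square survival function with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def ljung_box_lag1(values):
    """Ljung-Box Q statistic and p-value using only the lag-1 autocorrelation."""
    x = np.asarray(values, dtype=float)
    n = len(x)
    centered = x - x.mean()
    rho1 = np.dot(centered[:-1], centered[1:]) / np.dot(centered, centered)
    stat = n * (n + 2) * rho1 ** 2 / (n - 1)
    return stat, chi2_sf_1df(stat)


# Statistic logged for diff1_loan_rate_A in the output below
p_from_logged_stat = chi2_sf_1df(0.4402266706066844)
print(p_from_logged_stat)

# Hypothetical strongly autocorrelated series: alternating signs
lb_stat, lb_pvalue = ljung_box_lag1([1.0, -1.0] * 50)
print(lb_stat, lb_pvalue)
```

A small p-value rejects the null of no autocorrelation, so the alternating series is flagged while diff1_loan_rate_A is not.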

Step 4: Test for seasonality using the Augmented Dickey-Fuller (ADF) test

Step 5: Analyze the seasonality test results

Step 6: Interpret the results

Step 7: Handle seasonality

4.2. Methodology Selection and Development

4.2.4. Variable Analysis

A. Feature Analysis

A.1. Univariate Analysis

Visual Inspection

A.2. Multivariate Analysis

Visual Inspection

B. Variable Selection

ARIMA Analysis

Step 1: Identify the Integration order (Stationarity Analysis)

Unit Root Tests.

vm.run_test_plan("unit_root_test_plan", train_ds=vm_train_ds, test_ds=vm_test_ds)

Step 2: Identify the AR order

Step 3: Identify the MA order

vm.run_test_plan("normality_test_plan", train_ds=vm_train_ds, test_ds=vm_test_ds)

Run SeasonalDecomposeMetricWithFigure Test

# test_context = TestContext(train_ds=vm_train_ds)
# sd_metric = SeasonalDecomposeMetricWithFigure(test_context=test_context)
# sd_metric.run()

Run ResidualsVisualInspection Test

test_context = TestContext(train_ds=vm_train_ds, test_ds=vm_test_ds)
rvi_test = ResidualsVisualInspection(test_context=test_context)
rvi_test.run()
vm.run_test_plan("seasonality_test_plan", train_ds=vm_train_ds, test_ds=vm_test_ds)
loan_rate_columns = ["loan_rate_A", "loan_rate_B", "loan_rate_C", "loan_rate_D"]
diff1_loan_rate_columns = ["diff1_loan_rate_A", "diff1_loan_rate_B", "diff1_loan_rate_C", "diff1_loan_rate_D"]

test_plan_config = {
    "time_series_univariate_inspection_raw": {
        "columns": loan_rate_columns + diff1_loan_rate_columns
    },
    "time_series_univariate_inspection_histogram": {
        "columns": loan_rate_columns + diff1_loan_rate_columns
    }
}

vm.run_test_plan(
    "timeseries_test_plan",
    config=test_plan_config,    
    test_ds=vm_test_ds,
    train_ds=vm_train_ds,
    dataset=vm_train_ds,
)

Results for time_series_univariate_inspection Test Plan:


Logged the following plots to the ValidMind platform:

Metric Plots

Logged the following plots to the ValidMind platform:

Metric Plots
The default method 'yw' can produce PACF values outside of the [-1,1] interval. After 0.13, the default will change to unadjusted Yule-Walker ('ywm'). You can use this method now by setting method='ywm'.

Results for autocorrelation_test_plan Test Plan:


Logged the following evaluation metric to the ValidMind platform:

Metric Name: ljung_box
Metric Type: evaluation
Metric Scope: test
Metric Value:
{'loan_rate_A': {'stat': 99.79377801179028, 'pvalue': 1.691207273732272e-23}, 'loan_rate_B': {'stat': 98.47643762617452, 'pvalue': 3.289150569642567e-23}, 'loan_rate_C': {'stat': 102.72120544262356, 'pvalue': 3.8579272650362314e-24}, 'loan_rate_D': {'stat': 102.06119689975492, 'pvalue': 5.383276667018511e-24}, 'FEDFUNDS': {'stat': 92.49617760172814, 'pvalue': 6.745493792730931e-22}, 'diff1_loan_rate_A': {'stat': 0.4402266706066844, 'pvalue': 0.5070130714060204}, 'diff1_loan_rate_B': {'stat': 9.03053378524947, 'pvalue': 0.0026550694107563377}, 'diff1_loan_rate_C': {'stat': 8.105512733213732, 'pvalue': 0.004413083679995846}, 'diff1_loan_rate_D': {'stat': 2.8385839077615724, 'pvalue': 0.09202528239248291}, 'diff1_FEDFUNDS': {'stat': 47.24114990634128, 'pvalue': 6.276875252119377e-12}, 'diff2_FEDFUNDS': {'stat': 2.412848114121581, 'pvalue': 0.12034323868942863}}

Logged the following evaluation metric to the ValidMind platform:

Metric Name: box_pierce
Metric Type: evaluation
Metric Scope: test
Metric Value:
{'loan_rate_A': {'stat': 99.79377801179028, 'pvalue': 1.691207273732272e-23}, 'loan_rate_B': {'stat': 98.47643762617452, 'pvalue': 3.289150569642567e-23}, 'loan_rate_C': {'stat': 102.72120544262356, 'pvalue': 3.8579272650362314e-24}, 'loan_rate_D': {'stat': 102.06119689975492, 'pvalue': 5.383276667018511e-24}, 'FEDFUNDS': {'stat': 92.49617760172814, 'pvalue': 6.745493792730931e-22}, 'diff1_loan_rate_A': {'stat': 0.4402266706066844, 'pvalue': 0.5070130714060204}, 'diff1_loan_rate_B': {'stat': 9.03053378524947, 'pvalue': 0.0026550694107563377}, 'diff1_loan_rate_C': {'stat': 8.105512733213732, 'pvalue': 0.004413083679995846}, 'diff1_loan_rate_D': {'stat': 2.8385839077615724, 'pvalue': 0.09202528239248291}, 'diff1_FEDFUNDS': {'stat': 47.24114990634128, 'pvalue': 6.276875252119377e-12}, 'diff2_FEDFUNDS': {'stat': 2.412848114121581, 'pvalue': 0.12034323868942863}}

Logged the following evaluation metric to the ValidMind platform:

Metric Name: runs_test
Metric Type: evaluation
Metric Scope: test
Metric Value:
{'loan_rate_A': {'stat': -8.632956969312945, 'pvalue': 5.978593166876075e-18}, 'loan_rate_B': {'stat': -9.225267842093881, 'pvalue': 2.8285309180225434e-20}, 'loan_rate_C': {'stat': -9.408682037015677, 'pvalue': 5.0239675387170514e-21}, 'loan_rate_D': {'stat': -10.196439746918763, 'pvalue': 2.056741863262086e-24}, 'FEDFUNDS': {'stat': -10.094086373225906, 'pvalue': 5.8674887432668504e-24}, 'diff1_loan_rate_A': {'stat': 0.09804933849375877, 'pvalue': 0.9218931156254718}, 'diff1_loan_rate_B': {'stat': -0.6586877630844152, 'pvalue': 0.5100962925233736}, 'diff1_loan_rate_C': {'stat': -2.8948770301798645, 'pvalue': 0.0037930709179304486}, 'diff1_loan_rate_D': {'stat': -0.5392715515560418, 'pvalue': 0.589699495478653}, 'diff1_FEDFUNDS': {'stat': -8.465639721231678, 'pvalue': 2.5475051177682628e-17}, 'diff2_FEDFUNDS': {'stat': 1.8723472602592905, 'pvalue': 0.061158576378083265}}

Results for normality_test_plan Test Plan:


Logged the following evaluation metric to the ValidMind platform:

Metric Name: jarque_bera
Metric Type: evaluation
Metric Scope: test
Metric Value:
{'loan_rate_A': {'stat': 2.848172850453067, 'pvalue': 0.24072828607015606, 'skew': 0.3894929136110148, 'kurtosis': 2.821048639289756}, 'loan_rate_B': {'stat': 7.948577904082513, 'pvalue': 0.018792659227110216, 'skew': -0.1678688191978654, 'kurtosis': 1.7076614865010464}, 'loan_rate_C': {'stat': 1.4151119572602828, 'pvalue': 0.4928472559373931, 'skew': -0.2667218461199135, 'kurtosis': 2.818765023550964}, 'loan_rate_D': {'stat': 5.805990427261884, 'pvalue': 0.05485866032647532, 'skew': -0.3686251358879733, 'kurtosis': 2.1289430194058165}, 'FEDFUNDS': {'stat': 351.77897186101717, 'pvalue': 4.094179086914819e-77, 'skew': 2.807862376508006, 'kurtosis': 9.882392761343626}, 'diff1_loan_rate_A': {'stat': 49.69819494996339, 'pvalue': 1.615005799924691e-11, 'skew': -0.292775263452962, 'kurtosis': 6.287003081958673}, 'diff1_loan_rate_B': {'stat': 84.6129819949766, 'pvalue': 4.2317929537986954e-19, 'skew': -0.9659733072874014, 'kurtosis': 6.904637635216992}, 'diff1_loan_rate_C': {'stat': 13.59037152041562, 'pvalue': 0.00...

Logged the following evaluation metric to the ValidMind platform:

Metric Name: kolmogorov_smirnov
Metric Type: evaluation
Metric Scope: test
Metric Value:
{'loan_rate_A': {'stat': 0.08132576894776711, 'pvalue': 0.08981725287412488}, 'loan_rate_B': {'stat': 0.12791454984124828, 'pvalue': 0.0009999999999998899}, 'loan_rate_C': {'stat': 0.13865442725902158, 'pvalue': 0.0009999999999998899}, 'loan_rate_D': {'stat': 0.12708648400842287, 'pvalue': 0.0009999999999998899}, 'FEDFUNDS': {'stat': 0.4185270469008754, 'pvalue': 0.0009999999999998899}, 'diff1_loan_rate_A': {'stat': 0.17590867451812156, 'pvalue': 0.0009999999999998899}, 'diff1_loan_rate_B': {'stat': 0.18838563953847634, 'pvalue': 0.0009999999999998899}, 'diff1_loan_rate_C': {'stat': 0.1810182591944418, 'pvalue': 0.0009999999999998899}, 'diff1_loan_rate_D': {'stat': 0.16004784597113098, 'pvalue': 0.0009999999999998899}, 'diff1_FEDFUNDS': {'stat': 0.3742488536123778, 'pvalue': 0.0009999999999998899}, 'diff2_FEDFUNDS': {'stat': 0.28686393808100397, 'pvalue': 0.0009999999999998899}}

Logged the following evaluation metric to the ValidMind platform:

Metric Name: shapiro_wilk
Metric Type: evaluation
Metric Scope: test
Metric Value:
{'loan_rate_A': {'stat': 0.975836992263794, 'pvalue': 0.04807629808783531}, 'loan_rate_B': {'stat': 0.9356268644332886, 'pvalue': 6.0137466789456084e-05}, 'loan_rate_C': {'stat': 0.9475300312042236, 'pvalue': 0.0003490431990940124}, 'loan_rate_D': {'stat': 0.9415098428726196, 'pvalue': 0.00014052187907509506}, 'FEDFUNDS': {'stat': 0.4598355293273926, 'pvalue': 4.666416175806635e-18}, 'diff1_loan_rate_A': {'stat': 0.8952228426933289, 'pvalue': 4.106910580503609e-07}, 'diff1_loan_rate_B': {'stat': 0.8786153793334961, 'pvalue': 7.289008863153867e-08}, 'diff1_loan_rate_C': {'stat': 0.9339078068733215, 'pvalue': 4.726010956801474e-05}, 'diff1_loan_rate_D': {'stat': 0.8760568499565125, 'pvalue': 5.656733037540107e-08}, 'diff1_FEDFUNDS': {'stat': 0.4879304766654968, 'pvalue': 1.3033836108341916e-17}, 'diff2_FEDFUNDS': {'stat': 0.59507155418396, 'pvalue': 1.008704643589071e-15}}

Logged the following evaluation metric to the ValidMind platform:

Metric Name: lilliefors_test
Metric Type: evaluation
Metric Scope: test
Metric Value:
{'loan_rate_A': {'stat': 0.08132576894776711, 'pvalue': 0.08981725287412488}, 'loan_rate_B': {'stat': 0.12791454984124828, 'pvalue': 0.0009999999999998899}, 'loan_rate_C': {'stat': 0.13865442725902158, 'pvalue': 0.0009999999999998899}, 'loan_rate_D': {'stat': 0.12708648400842287, 'pvalue': 0.0009999999999998899}, 'FEDFUNDS': {'stat': 0.4185270469008754, 'pvalue': 0.0009999999999998899}, 'diff1_loan_rate_A': {'stat': 0.17590867451812156, 'pvalue': 0.0009999999999998899}, 'diff1_loan_rate_B': {'stat': 0.18838563953847634, 'pvalue': 0.0009999999999998899}, 'diff1_loan_rate_C': {'stat': 0.1810182591944418, 'pvalue': 0.0009999999999998899}, 'diff1_loan_rate_D': {'stat': 0.16004784597113098, 'pvalue': 0.0009999999999998899}, 'diff1_FEDFUNDS': {'stat': 0.3742488536123778, 'pvalue': 0.0009999999999998899}, 'diff2_FEDFUNDS': {'stat': 0.28686393808100397, 'pvalue': 0.0009999999999998899}}

Results for residuals_test_plan Test Plan:


Logged the following plots to the ValidMind platform:

Metric Plots

Results for seasonality_test_plan Test Plan:


Logged the following evaluation metric to the ValidMind platform:

Metric Name: seasonal_decompose
Metric Type: evaluation
Metric Scope:
Metric Value:
{'loan_rate_A': [{'Date': '2007-08-01', 'loan_rate_A': 7.7666666666666675, 'trend': nan, 'seasonal': -0.050284773390468364, 'resid': nan}, {'Date': '2007-09-01', 'loan_rate_A': 7.841428571428572, 'trend': nan, 'seasonal': -0.06087962072801926, 'resid': nan}, {'Date': '2007-10-01', 'loan_rate_A': 7.83, 'trend': nan, 'seasonal': 0.01749661199350169, 'resid': nan}, {'Date': '2007-11-01', 'loan_rate_A': 7.779090909090908, 'trend': nan, 'seasonal': -0.047258378330469565, 'resid': nan}, {'Date': '2007-12-01', 'loan_rate_A': 7.695833333333333, 'trend': nan, 'seasonal': 0.08505178146324885, 'resid': nan}, {'Date': '2008-01-01', 'loan_rate_A': 7.961333333333333, 'trend': nan, 'seasonal': 0.06564185816692848, 'resid': nan}, {'Date': '2008-02-01', 'loan_rate_A': 8.130333333333333, 'trend': 8.005048767959094, 'seasonal': 0.008943337297934253, 'resid': 0.11634122807630445}, {'Date': '2008-03-01', 'loan_rate_A': 8.126285714285714, 'trend': 8.036669799705125, 'seasonal': -0.002099404440702811, 'resid': 0.09171531902129207},...
Metric Plots

Logged the following evaluation metric to the ValidMind platform:

Metric Name: seasonality_detection_with_acf
Metric Type: evaluation
Metric Scope:
Metric Value:
{'loan_rate_A': {'acf_values': array([ 1.        ,  0.95235645,  0.89628237,  0.83365556,  0.78298876,
        0.73759763,  0.68398844,  0.62657655,  0.56468994,  0.50331047,
        0.42183716,  0.34023758,  0.25859584,  0.18191055,  0.11034449,
        0.04315845, -0.0184955 , -0.0788836 , -0.12609771, -0.17004327,
       -0.21018971, -0.24498837, -0.28174988, -0.31377348, -0.33500256,
       -0.3343806 , -0.32206274, -0.30733837, -0.29602269, -0.28022519,
       -0.26123446, -0.23725676, -0.22539932, -0.20575443, -0.18358424,
       -0.16270934, -0.13889279, -0.11922772, -0.09506086, -0.06371048,
       -0.03221894]), 'pacf_values': array([ 1.00000000e+00,  9.61340944e-01, -1.42725992e-01, -1.13803080e-01,
        1.45601541e-01,  1.12916226e-02, -2.01201298e-01, -4.90737660e-02,
       -4.91281155e-02, -7.22772152e-02, -4.00627990e-01, -1.93018777e-02,
       -5.08939592e-03, -1.57441937e-01, -1.01953152e-01,  9.65840221e-02,
        2.09312300e-02, -1.34464258e-01,  1.90609883e-01,  1.59998039e-01,
     ...
Metric Plots
The test statistic is outside of the range of p-values available in the
look-up table. The actual p-value is greater than the p-value returned.

The test statistic is outside of the range of p-values available in the
look-up table. The actual p-value is smaller than the p-value returned.

Running Metric: dickey_fuller_gls:  80%|████████  | 4/5 [00:00<00:00,  7.54it/s]

Results for unit_root_test_plan Test Plan:


Logged the following evaluation metric to the ValidMind platform:

Metric Name
adf
Metric Type
evaluation
Metric Scope
test
Metric Value
{'loan_rate_A': {'stat': -1.4553257254174512, 'pvalue': 0.5554840752939311, 'usedlag': 0, 'nobs': 106, 'critical_values': {'1%': -3.4936021509366793, '5%': -2.8892174239808703, '10%': -2.58153320754717}, 'icbest': -38.45407769604711}, 'loan_rate_B': {'stat': -2.4400180904237536, 'pvalue': 0.13075859422992442, 'usedlag': 1, 'nobs': 105, 'critical_values': {'1%': -3.4942202045135513, '5%': -2.889485291005291, '10%': -2.5816762131519275}, 'icbest': -10.155605078660443}, 'loan_rate_C': {'stat': -2.434500294638295, 'pvalue': 0.13223473262843088, 'usedlag': 1, 'nobs': 105, 'critical_values': {'1%': -3.4942202045135513, '5%': -2.889485291005291, '10%': -2.5816762131519275}, 'icbest': -52.03282764335805}, 'loan_rate_D': {'stat': -1.4513637333099165, 'pvalue': 0.5574193301855702, 'usedlag': 5, 'nobs': 101, 'critical_values': {'1%': -3.4968181663902103, '5%': -2.8906107514600103, '10%': -2.5822770483285953}, 'icbest': 3.7868558503568295}, 'FEDFUNDS': {'stat': -8.197180166834837, 'pvalue': 7.395346590172887e-13, 'usedla...
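The `adf` dictionary above can be read with the usual decision rule: the null hypothesis is a unit root, and it is rejected when the test statistic falls below the critical value. The sketch below applies that rule at the 5% level to the four loan-rate series, using values copied from the logged output. The names `adf_results` and `rejects_unit_root` are illustrative and not part of ValidMind. `FEDFUNDS` is left out because its entry is truncated in the log.

```python
# Statistic and 5% critical value per series, copied from the logged adf metric.
adf_results = {
    "loan_rate_A": {"stat": -1.4553257254174512, "crit_5pct": -2.8892174239808703},
    "loan_rate_B": {"stat": -2.4400180904237536, "crit_5pct": -2.889485291005291},
    "loan_rate_C": {"stat": -2.434500294638295, "crit_5pct": -2.889485291005291},
    "loan_rate_D": {"stat": -1.4513637333099165, "crit_5pct": -2.8906107514600103},
}


def rejects_unit_root(stat, critical_value):
    """ADF rule: reject the unit-root null when the statistic is below the critical value."""
    return stat < critical_value


adf_decisions = {
    name: rejects_unit_root(r["stat"], r["crit_5pct"])
    for name, r in adf_results.items()
}

for name, rejected in adf_decisions.items():
    verdict = "reject unit root" if rejected else "cannot reject unit root"
    print(f"{name}: {verdict}")
```

None of the four loan-rate series rejects the unit-root null in levels, which matches their logged p-values (all above 0.05).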

Logged the following evaluation metric to the ValidMind platform:

Metric Name
kpss
Metric Type
evaluation
Metric Scope
test
Metric Value
{'loan_rate_A': {'stat': 0.7103649122778128, 'pvalue': 0.012603189792926104, 'usedlag': 6, 'critical_values': {'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739}}, 'loan_rate_B': {'stat': 0.2842921343111868, 'pvalue': 0.1, 'usedlag': 5, 'critical_values': {'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739}}, 'loan_rate_C': {'stat': 0.768398920460664, 'pvalue': 0.01, 'usedlag': 6, 'critical_values': {'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739}}, 'loan_rate_D': {'stat': 1.1834573394566732, 'pvalue': 0.01, 'usedlag': 6, 'critical_values': {'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739}}, 'FEDFUNDS': {'stat': 0.7605770293335444, 'pvalue': 0.01, 'usedlag': 5, 'critical_values': {'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739}}, 'diff1_loan_rate_A': {'stat': 0.08184287789137065, 'pvalue': 0.1, 'usedlag': 1, 'critical_values': {'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739}}, 'diff1_loan_rate_B': {'stat': 0.2765284129153882, 'pvalue': 0.1, 'usedlag': 3, 'critical_values': {'...
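KPSS reverses the ADF hypotheses: the null is stationarity, and it is rejected when the statistic exceeds the critical value. The logged p-values of exactly 0.01 and 0.1 also suggest a bounded look-up table, which would explain the "outside of the range of p-values" warnings earlier in the output. The sketch below assumes that behaviour (as in a statsmodels-style KPSS implementation) and classifies a few of the logged statistics. `read_kpss` is a hypothetical helper.

```python
# Critical values and statistics copied from the logged kpss metric.
KPSS_CRITICAL = {"10%": 0.347, "5%": 0.463, "2.5%": 0.574, "1%": 0.739}

kpss_stats = {
    "loan_rate_A": 0.7103649122778128,
    "loan_rate_B": 0.2842921343111868,
    "loan_rate_C": 0.768398920460664,
    "loan_rate_D": 1.1834573394566732,
    "FEDFUNDS": 0.7605770293335444,
    "diff1_loan_rate_A": 0.08184287789137065,
}


def read_kpss(stat, critical=KPSS_CRITICAL):
    """Return (rejects stationarity at 5%, how the reported p-value should be read)."""
    rejects = stat > critical["5%"]
    if stat > critical["1%"]:
        note = "actual p-value below reported 0.01"
    elif stat < critical["10%"]:
        note = "actual p-value above reported 0.1"
    else:
        note = "p-value interpolated from table"
    return rejects, note


kpss_reading = {name: read_kpss(stat) for name, stat in kpss_stats.items()}

for name, (rejects, note) in kpss_reading.items():
    print(f"{name}: rejects stationarity={rejects} ({note})")
```

Under this reading, `loan_rate_A`, `loan_rate_C`, `loan_rate_D`, and `FEDFUNDS` reject stationarity in levels, while `loan_rate_B` and `diff1_loan_rate_A` do not.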

Logged the following evaluation metric to the ValidMind platform:

Metric Name
phillips_perron
Metric Type
evaluation
Metric Scope
test
Metric Value
{'loan_rate_A': {'stat': -1.7947706466647726, 'pvalue': 0.3830609957530101, 'usedlag': 13, 'nobs': 106}, 'loan_rate_B': {'stat': -2.1487566941049545, 'pvalue': 0.2253663008315427, 'usedlag': 13, 'nobs': 106}, 'loan_rate_C': {'stat': -2.1885847453987886, 'pvalue': 0.2104145920459592, 'usedlag': 13, 'nobs': 106}, 'loan_rate_D': {'stat': -1.8557518683994831, 'pvalue': 0.3531543436028597, 'usedlag': 13, 'nobs': 106}, 'FEDFUNDS': {'stat': -6.869103205314763, 'pvalue': 1.5318805564868218e-09, 'usedlag': 13, 'nobs': 106}, 'diff1_loan_rate_A': {'stat': -9.554804673411917, 'pvalue': 2.515553684238378e-16, 'usedlag': 13, 'nobs': 106}, 'diff1_loan_rate_B': {'stat': -7.55192025294947, 'pvalue': 3.172220787230908e-11, 'usedlag': 13, 'nobs': 106}, 'diff1_loan_rate_C': {'stat': -8.203322014389412, 'pvalue': 7.13347610412015e-13, 'usedlag': 13, 'nobs': 106}, 'diff1_loan_rate_D': {'stat': -9.068469476799125, 'pvalue': 4.367152952420752e-15, 'usedlag': 13, 'nobs': 106}, 'diff1_FEDFUNDS': {'stat': -4.857892660681701, 'pvalue': ...
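Comparing the level and `diff1_` entries in the `phillips_perron` output gives a rough indication of each series' order of integration: a series that fails to reject the unit-root null in levels but rejects it after one difference behaves like an I(1) process. The sketch below applies that reading at a 5% significance level to the p-values logged above. The threshold and the helper `apparent_order` are assumptions made for illustration. `diff1_FEDFUNDS` is omitted because its p-value is truncated in the log.

```python
# p-values copied from the logged phillips_perron metric.
pp_pvalues = {
    "loan_rate_A": 0.3830609957530101,
    "loan_rate_B": 0.2253663008315427,
    "loan_rate_C": 0.2104145920459592,
    "loan_rate_D": 0.3531543436028597,
    "FEDFUNDS": 1.5318805564868218e-09,
    "diff1_loan_rate_A": 2.515553684238378e-16,
    "diff1_loan_rate_B": 3.172220787230908e-11,
    "diff1_loan_rate_C": 7.13347610412015e-13,
    "diff1_loan_rate_D": 4.367152952420752e-15,
}

ALPHA = 0.05  # assumed significance level


def apparent_order(name, pvalues=pp_pvalues, alpha=ALPHA):
    """Return 0 or 1 for the apparent order of integration, or None if undetermined."""
    if pvalues[name] < alpha:
        return 0  # unit root rejected in levels
    diff_p = pvalues.get(f"diff1_{name}")
    if diff_p is not None and diff_p < alpha:
        return 1  # unit root rejected only after first differencing
    return None


series = ["loan_rate_A", "loan_rate_B", "loan_rate_C", "loan_rate_D", "FEDFUNDS"]
orders = {name: apparent_order(name) for name in series}
print(orders)
```

On the Phillips-Perron p-values alone, the four loan-rate series look I(1) and `FEDFUNDS` looks stationary in levels. The logged `kpss` and `dickey_fuller_gls` results for `FEDFUNDS` point the other way, so this reading should be treated as one input rather than a conclusion.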

Logged the following evaluation metric to the ValidMind platform:

Metric Name
zivot_andrews
Metric Type
evaluation
Metric Scope
test
Metric Value
{'loan_rate_A': {'stat': -3.2237630693394217, 'pvalue': 0.8298567955055753, 'usedlag': 0, 'nobs': 107}, 'loan_rate_B': {'stat': -3.6338697727907947, 'pvalue': 0.5918309323718615, 'usedlag': 1, 'nobs': 107}, 'loan_rate_C': {'stat': -4.032048442156803, 'pvalue': 0.33266966738846426, 'usedlag': 1, 'nobs': 107}, 'loan_rate_D': {'stat': -4.563937525734853, 'pvalue': 0.10066048370203434, 'usedlag': 5, 'nobs': 107}, 'FEDFUNDS': {'stat': -11.575725821429458, 'pvalue': 1e-05, 'usedlag': 13, 'nobs': 107}, 'diff1_loan_rate_A': {'stat': -10.167793752342952, 'pvalue': 1e-05, 'usedlag': 0, 'nobs': 107}, 'diff1_loan_rate_B': {'stat': -8.086030498466938, 'pvalue': 1e-05, 'usedlag': 0, 'nobs': 107}, 'diff1_loan_rate_C': {'stat': -8.281520085893655, 'pvalue': 1e-05, 'usedlag': 0, 'nobs': 107}, 'diff1_loan_rate_D': {'stat': -3.4961674812109163, 'pvalue': 0.6820955243074778, 'usedlag': 4, 'nobs': 107}, 'diff1_FEDFUNDS': {'stat': -10.26868863233728, 'pvalue': 1e-05, 'usedlag': 12, 'nobs': 107}, 'diff2_FEDFUNDS': {'stat': -4.95177...

Logged the following evaluation metric to the ValidMind platform:

Metric Name
dickey_fuller_gls
Metric Type
evaluation
Metric Scope
test
Metric Value
{'loan_rate_A': {'stat': -1.4055918278417512, 'pvalue': 0.15413260012459246, 'usedlag': 0, 'nobs': 106}, 'loan_rate_B': {'stat': -1.2801714778787454, 'pvalue': 0.191186947030846, 'usedlag': 1, 'nobs': 105}, 'loan_rate_C': {'stat': -0.8536809092685893, 'pvalue': 0.35535587002032143, 'usedlag': 10, 'nobs': 96}, 'loan_rate_D': {'stat': 0.11546012339375353, 'pvalue': 0.7320016245444624, 'usedlag': 5, 'nobs': 101}, 'FEDFUNDS': {'stat': -0.3093806333771481, 'pvalue': 0.5779054549583083, 'usedlag': 13, 'nobs': 93}, 'diff1_loan_rate_A': {'stat': -8.925940939377956, 'pvalue': 7.99387402325497e-15, 'usedlag': 0, 'nobs': 106}, 'diff1_loan_rate_B': {'stat': -6.805919792874811, 'pvalue': 2.636597449793276e-10, 'usedlag': 0, 'nobs': 106}, 'diff1_loan_rate_C': {'stat': -5.881956694764737, 'pvalue': 2.2475055308107982e-08, 'usedlag': 0, 'nobs': 106}, 'diff1_loan_rate_D': {'stat': -0.7530745018244421, 'pvalue': 0.40112408887330814, 'usedlag': 4, 'nobs': 102}, 'diff1_FEDFUNDS': {'stat': -0.66942031434376, 'pvalue': 0.440491176...